
feat: Added support for Claude 3+ Chat API in Bedrock #2870

Merged: 6 commits, Jan 17, 2025

Conversation

RyanKadri
Contributor

@RyanKadri RyanKadri commented Jan 10, 2025

Description

The current version of the Bedrock instrumentation is mostly geared toward supporting the Claude Text Completions API but may not split messages as expected in multi-turn conversations with the Messages API.

The Text Completions API supports a single prompt message, but the Messages API lets you provide multiple distinct messages (context and prompt) to the model. Currently the agent combines those separate context messages into a single prompt, but it should probably follow the pattern we use in OpenAI instrumentation and split them into multiple LlmChatCompletionMessage events (still one LlmChatCompletionSummary per API call).
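The split described above can be sketched as follows. The event shapes and the `buildEvents` helper are simplified stand-ins for illustration, not the agent's actual classes:

```javascript
// Hypothetical sketch of the OpenAI-style split: one summary event per API
// call, plus one message-style event per prompt message and per completion.
function buildEvents(promptMessages, completions) {
  const messageEvents = [
    ...promptMessages.map((m) => ({ type: 'LlmChatCompletionMessage', content: m, isResponse: false })),
    ...completions.map((c) => ({ type: 'LlmChatCompletionMessage', content: c, isResponse: true }))
  ]
  // One summary per API call, regardless of how many messages were exchanged
  const summary = { type: 'LlmChatCompletionSummary', numMessages: messageEvents.length }
  return { summary, messageEvents }
}

const { summary, messageEvents } = buildEvents(
  ['You are a helpful assistant', 'What is an elephant?'],
  ['An elephant is a large mammal.']
)
// Two prompt messages plus one completion: three message events, one summary
```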

To make things more complicated, the Messages API also allows messages to have sub-chunks to support multi-modal use cases (see the Anthropic docs for how this works). For example, one conceptual multi-modal message might be: What is this animal <picture of an elephant> and where does it live? For these cases, if a single message has multiple chunks, this PR attempts to join them into one string representation with reasonable placeholders for the non-text components. Further down the road, the AI Monitoring team needs to do a bit of work to decide whether representing those non-text chunks as placeholders in a string is the right approach. For now, though, we just want something to show.
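A minimal sketch of that joining behavior. The function name and placeholder format here are illustrative (the PR's real helper appears in the diff as stringifyClaudeChunkedMessage):

```javascript
// Illustrative only: join a Messages API content array into one string,
// substituting a placeholder for each non-text chunk.
function stringifyChunkedMessage(contentChunks) {
  return contentChunks
    .map((chunk) => (chunk.type === 'text' ? chunk.text : `<${chunk.type}>`))
    .join('\n\n')
}

// A conceptual multi-modal message like the elephant example above
const message = [
  { type: 'text', text: 'What is this animal' },
  { type: 'image', source: { type: 'base64', media_type: 'image/png', data: '...' } },
  { type: 'text', text: 'and where does it live?' }
]
const joined = stringifyChunkedMessage(message)
// 'What is this animal\n\n<image>\n\nand where does it live?'
```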

How to Test

The AI Monitoring sample app is running this version of the instrumentation for Anthropic models. The AI Monitoring team can help you find where to see the data it produces.

Related Issues

N/A

@bizob2828 bizob2828 self-assigned this Jan 13, 2025
Member

@bizob2828 bizob2828 left a comment

Thanks for the PR @RyanKadri! A few tiny suggestions, but you also have merge conflicts, so I didn't do a thorough review yet.

const chatCompletionMessage = new LlmChatCompletionMessage({
  agent,
  segment,
  bedrockCommand,
  bedrockResponse,
  isResponse: true,
-  index: index + 1,
+  index: index,
Suggested change:
-  index: index,
+  index,


This is a little confusing. What about splitting the vars, so the prompt index can be promptIndex and in here you have completionIndex; then this index is promptIndex + completionIndex?
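The reviewer's naming scheme, sketched with hypothetical surrounding code (promptIndex and completionIndex come from the comment; everything else is illustrative):

```javascript
// Prompt messages occupy indexes 0..n-1; each completion continues the
// sequence at promptMessages.length + completionIndex.
function indexedEvents(promptMessages, completionMessages) {
  const events = []
  promptMessages.forEach((content, promptIndex) => {
    events.push({ content, index: promptIndex })
  })
  completionMessages.forEach((content, completionIndex) => {
    events.push({ content, index: promptMessages.length + completionIndex })
  })
  return events
}

const events = indexedEvents(['context', 'question'], ['answer'])
// The completion lands at index 2, directly after the two prompt messages
```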

lib/instrumentation/aws-sdk/v3/bedrock.js (thread outdated; resolved)
@bizob2828 bizob2828 changed the title Support Claude 3+ Chat API in Bedrock feat: Added support for Claude 3+ Chat API in Bedrock Jan 13, 2025
@bizob2828
Member

Please run npm run lint:fix to fix your linting errors, or address them individually with npx eslint --fix path/to/file.js.

@RyanKadri
Contributor Author

Sounds good 👍. I have a couple of other changes incoming for another unit test scenario I want to cover.

@RyanKadri
Contributor Author

(Looking into unit tests)

@bizob2828
Member

@RyanKadri is this ready for another review? You have merge conflicts again.

@RyanKadri
Contributor Author

Sorry, I got caught up in a couple of other tasks. I'm still taking a look at the tests. I'll fix the merge conflicts as well.

Comment on lines +199 to +200

embeddings.forEach(embedding => {
  recordEvent({ agent, type: 'LlmEmbedding', msg: embedding })

Contributor Author

I need to do a bit more research on this, but I think some of the Bedrock embedding models allow you to make a single invoke call that generates several embeddings (see this Cohere blog for an example). So I think it might be correct to allow unrolling one embedding command into several embedding events. Currently all the embedding models in the command class produce a single prompt, but I'm wondering if the Cohere one is also incorrectly squashing messages in some cases.

I might try to treat that as a separate PR / task if you're alright with that though

Contributor Author

If that's the case, how do you want to handle an error? I assume we still want one error attached to the transaction? I opted to only attach the embedding info if there's one event, to keep the current behavior, but I'm not sure that's correct.

Member

I don't know this enough, but seems ok for now

@RyanKadri
Contributor Author

@bizob2828 I think I'm good for another review if you have a sec. I see some versioned tests are failing but they're complaining about an s3 client version and I don't think I touched anything that should break that?

Member

@bizob2828 bizob2828 left a comment

Overall looks good; just a few questions around some testing gaps.

@@ -36,7 +36,7 @@ class LlmChatCompletionSummary extends LlmEvent {
   const cmd = this.bedrockCommand
   this[cfr] = this.bedrockResponse.finishReason
   this[rt] = cmd.temperature
-  this[nm] = 1 + this.bedrockResponse.completions.length
+  this[nm] = (this.bedrockCommand.prompt?.length ?? 0) + this.bedrockResponse.completions.length
Member

Based on the bedrockCommand file, this is always an array, so if the length is 0 it'll be 0. Do we need this defaulting to 0 when prompt.length is falsey?
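The behavior the reviewer is questioning can be demonstrated directly (the command objects below are made up for illustration):

```javascript
// `prompt?.length ?? 0`: the `?? 0` fallback only fires when `prompt` itself
// is null or undefined; an empty array already yields 0 without it.
const counts = [
  { prompt: [] },         // length is 0; `??` never kicks in
  { prompt: ['a', 'b'] }, // length is 2
  {}                      // prompt is undefined: `?.` yields undefined, `??` supplies 0
].map((cmd) => cmd.prompt?.length ?? 0)
// counts is [0, 2, 0]
```

So if the command class guarantees an array, the `?? 0` is dead code; it only protects against a missing `prompt` property.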


@@ -63,7 +65,7 @@ class BedrockResponse {
     // Streamed response
     this.#completions = body.completions
   } else {
-    this.#completions = body?.content?.map((c) => c.text)
+    this.#completions = [stringifyClaudeChunkedMessage(body?.content)]
Member

I see a versioned test but not a unit test for this.

    result = collected.join(' ')
    return [{ role: 'user', content: this.#body.prompt }]
  } else if (this.isClaudeMessagesApi() === true) {
    return normalizeClaude3Messages(this.#body?.messages ?? [])
Member

@bizob2828 bizob2828 Jan 16, 2025

I don't see a unit test or versioned test asserting that this.#body.messages is falsey and defaults to an empty array. Is this needed? If so, please write a unit test.

Contributor Author

This was meant to handle the customer passing an invalid body. I realized the default value isn't needed here because it's handled below anyway. I added a unit test for this case regardless, to catch where it's handled lower down.

@bizob2828
Member

> @bizob2828 I think I'm good for another review if you have a sec. I see some versioned tests are failing but they're complaining about an s3 client version and I don't think I touched anything that should break that?

I reviewed it. Yeah, that s3 failure seems like an issue with npm and installing; it passes locally for me. If it's still failing after I re-ran it, I can help dig.


codecov bot commented Jan 16, 2025

Codecov Report

Attention: Patch coverage is 98.46154% with 2 lines in your changes missing coverage. Please review.

Project coverage is 97.34%. Comparing base (46462d0) to head (340fecb).
Report is 2 commits behind head on main.

Files with missing lines                         Patch %   Lines
lib/llm-events/aws-bedrock/bedrock-command.js    96.36%    2 missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2870      +/-   ##
==========================================
- Coverage   97.39%   97.34%   -0.06%     
==========================================
  Files         308      310       +2     
  Lines       47330    47512     +182     
==========================================
+ Hits        46099    46252     +153     
- Misses       1231     1260      +29     
Flag Coverage Δ
integration-tests-cjs-18.x 73.02% <24.61%> (-0.16%) ⬇️
integration-tests-cjs-20.x 73.03% <24.61%> (-0.15%) ⬇️
integration-tests-cjs-22.x 73.05% <24.61%> (-0.16%) ⬇️
integration-tests-esm-18.x 49.78% <24.61%> (-0.05%) ⬇️
integration-tests-esm-20.x 49.79% <24.61%> (-0.05%) ⬇️
integration-tests-esm-22.x 49.83% <24.61%> (-0.05%) ⬇️
unit-tests-18.x 88.99% <79.23%> (-0.13%) ⬇️
unit-tests-20.x 88.99% <79.23%> (-0.13%) ⬇️
unit-tests-22.x 88.98% <79.23%> (-0.14%) ⬇️
versioned-tests-18.x 79.27% <95.38%> (-0.12%) ⬇️
versioned-tests-20.x 79.27% <95.38%> (-0.12%) ⬇️
versioned-tests-22.x 79.28% <95.38%> (-0.12%) ⬇️

Flags with carried forward coverage won't be shown.


 */
function normalizeClaude3Messages(messages) {
  const result = []
  for (const message of messages ?? []) {
Member

I'm going to approve this, but we can follow up on the missing coverage here: messages is falsey, and there is an array of messages but one entry is null.
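The two uncovered cases the reviewer names could be exercised with a sketch like this. normalizeMessages is a stand-in for the real normalizeClaude3Messages in bedrock-command.js, and the null-entry guard is an assumption about how the follow-up might handle it:

```javascript
// `messages ?? []` guards a falsey `messages` value; the per-entry check
// guards a null element inside the array (the second gap mentioned above).
function normalizeMessages(messages) {
  const result = []
  for (const message of messages ?? []) {
    if (message == null) continue // skip a null entry rather than throwing
    result.push({ role: message.role, content: message.content })
  }
  return result
}

const fromFalsey = normalizeMessages(undefined)
const fromNullEntry = normalizeMessages([null, { role: 'user', content: 'hi' }])
// fromFalsey is []; fromNullEntry keeps only the valid message
```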

@bizob2828 bizob2828 merged commit 6a83abf into newrelic:main Jan 17, 2025
26 of 27 checks passed
Labels: none
Project status: Done: Issues recently completed
2 participants